Unsupervised hierarchical probabilistic segmentation of discrete events

نویسندگان

  • Guy Shani
  • Asela Gunawardana
  • Christopher Meek
چکیده

Segmentation, the task of splitting a long sequence of symbols into chunks, can provide important information about the nature of the sequence that is understandable to humans. We focus on unsupervised segmentation, where the algorithm never sees examples of successful segmentation, but still needs to discover meaningful segments. In this paper we present an unsupervised learning algorithm for segmenting sequences of symbols or categorical events. Our algorithm hierarchically builds a lexicon of segments and computes a maximum likelihood segmentation given the current lexicon. Thus, our algorithm is most appropriate to hierarchical sequences, where smaller segments are grouped into larger segments. Our probabilistic approach also allows us to suggest conditional entropy as a measure of the quality of a segmentation in the absence of labeled data. We compare our algorithm to two previous approaches from the unsupervised segmentation literature, showing it to provide superior segmentation over a number of benchmarks. Our specific motivation for developing this general algorithm is to understand the behavior of software programs after deployment by analyzing their traces. We explain and motivate the importance of this problem, and present segmentation results from the interactions of a web service and its clients.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Extraction and 3D Segmentation of Tumors-Based Unsupervised Clustering Techniques in Medical Images

Introduction The diagnosis and separation of cancerous tumors in medical images require accuracy, experience, and time, and it has always posed itself as a major challenge to the radiologists and physicians. Materials and Methods We Received 290 medical images composed of 120 mammographic images, LJPEG format, scanned in gray-scale with 50 microns size, 110 MRI images including of T1-Wighted, T...

متن کامل

Tree Structured Dirichlet Processes for Hierarchical Morphological Segmentation

This article presents a probabilistic hierarchical clustering model for morphological segmentation. In contrast to existing approaches to morphology learning, our method allows learning hierarchical organization of word morphology as a collection of tree structured paradigms. The model is fully unsupervised and based on the hierarchical Dirichlet process (HDP). Tree hierarchies are learned alon...

متن کامل

STRUCTURED GRAPHICAL MODELS FOR UNSUPERVISED IMAGE SEGMENTATION By KITTIPAT KAMPA A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy STRUCTURED GRAPHICAL MODELS FOR UNSUPERVISED IMAGE SEGMENTATION By Kittipat Kampa Dec 2011 Chair: Jose C. Principe Major: Electrical and Computer Engineering In the dissertation, we seek the following goals: (1) to come up with a probabi...

متن کامل

Unsupervised Texture Image Segmentation Using MRFEM Framework

Texture image analysis is one of the most important working realms of image processing in medical sciences and industry. Up to present, different approaches have been proposed for segmentation of texture images. In this paper, we offered unsupervised texture image segmentation based on Markov Random Field (MRF) model. First, we used Gabor filter with different parameters’ (frequency, orientatio...

متن کامل

Efficient Texture Segmentation by Hierarchical Multiple Markov Chain Model

A novel multiscale texture model and a related algorithm for the unsupervised segmentation of medical images to locate tumors are proposed in this project. Elementary textures are characterized by their spatial interactions with neighboring regions along selected directions. Such interactions are modeled, in turn, by means of a set of Markov chains, one for each direction, whose parameters are ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Intell. Data Anal.

دوره 15  شماره 

صفحات  -

تاریخ انتشار 2011